Robust speaker change detection using Kernel-Gaussian model
Abstract
This paper introduces and evaluates a novel approach to unsupervised speaker change detection. Many unsupervised speaker change detection algorithms model each audio segment with a single multivariate Gaussian density, which assumes that the speech features of the segment are Gaussian-distributed. This assumption is too strong in many cases. The paper therefore presents an alternative to the single Gaussian model: a Gaussian model in a reproducing kernel Hilbert space (RKHS), or Kernel-Gaussian model (KGM). KGM first projects the speech features into an RKHS via a nonlinear mapping and then models the mapped features with a Gaussian density. The mapping enables KGM to capture nonlinear structure in the speech features. An implementation of KGM is proposed and evaluated. Experiments on different datasets show that KGM achieves better results than the single Gaussian model.
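The two steps named in the abstract (a nonlinear mapping, then a Gaussian fit in the mapped space) can be illustrated with a minimal sketch. This is not the paper's implementation, which the abstract does not describe. It assumes an RBF kernel approximated by random Fourier features as a stand-in for the RKHS mapping, a regularized full-covariance Gaussian, and a generalized log-likelihood ratio as the change score. All function names and parameter values (`rff_map`, `glr_score`, `gamma`, `reg`, the synthetic data) are hypothetical.

```python
import numpy as np

def rff_map(X, W, b):
    """Approximate RBF feature map (random Fourier features).
    Assumption: stands in for the paper's unspecified RKHS mapping."""
    D = W.shape[1]
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

def gaussian_loglik(Z, reg=1e-3):
    """Total log-likelihood of the rows of Z under one Gaussian fitted to Z.
    The covariance is regularized by reg * I (an assumed stabilizer)."""
    n, d = Z.shape
    Zc = Z - Z.mean(axis=0)
    cov = Zc.T @ Zc / n + reg * np.eye(d)
    _, logdet = np.linalg.slogdet(cov)
    maha = np.sum(Zc * np.linalg.solve(cov, Zc.T).T)  # sum of Mahalanobis terms
    return -0.5 * (n * d * np.log(2.0 * np.pi) + n * logdet + maha)

def glr_score(Z1, Z2, reg=1e-3):
    """Two-Gaussian versus one-Gaussian log-likelihood ratio.
    Larger values suggest a speaker change between the two windows."""
    merged = np.vstack([Z1, Z2])
    return (gaussian_loglik(Z1, reg) + gaussian_loglik(Z2, reg)
            - gaussian_loglik(merged, reg))

rng = np.random.default_rng(0)
d_in, D, n, gamma = 4, 16, 200, 0.5  # hypothetical sizes and kernel width

# Random projection for k(x, y) = exp(-gamma * ||x - y||^2)
W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d_in, D))
b = rng.uniform(0.0, 2.0 * np.pi, size=D)

# Synthetic stand-ins for acoustic features (not real speech data)
A1 = rng.normal(0.0, 0.3, size=(n, d_in))  # "speaker A", window 1
A2 = rng.normal(0.0, 0.3, size=(n, d_in))  # "speaker A", window 2
B = rng.normal(1.5, 0.3, size=(n, d_in))   # "speaker B"

Z_a = rff_map(A1, W, b)
score_same = glr_score(Z_a, rff_map(A2, W, b))
score_diff = glr_score(Z_a, rff_map(B, W, b))
print(f"same speaker: {score_same:.1f}, different speakers: {score_diff:.1f}")
```

In a detector built along these lines, the score would be computed for a pair of adjacent windows sliding over the audio stream, and a change would be hypothesized where the score peaks above a threshold. The threshold and window length would have to be tuned on held-out data.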
Similar resources
A robust wavelet based profile monitoring and change point detection using S-estimator and clustering
Some quality characteristics are well defined when treated as response variables and are related to some independent variables. This relationship is called a profile. Parametric models, such as linear models, may be used to model profiles. However, in practical applications, the complexity of many processes usually makes it impossible to model a process using parametric models. In these cas...
Speaker Change Detection in Broadcast TV Using Bidirectional Long Short-Term Memory Networks
Speaker change detection is an important step in a speaker diarization system. It aims at finding speaker change points in the audio stream. In this paper, it is treated as a sequence labeling task and addressed with bidirectional long short-term memory networks (Bi-LSTM). The system is trained and evaluated on the Broadcast TV subset of the ETAPE database. The result shows that the proposed model ...
Speaker Adaptation for Support Vector Machine based Word Prominence Detection
In this paper we propose a new speaker adaptation method to improve the detection of prominent words in speech. Prosodic cues are difficult to extract because different speakers use different features to express, for example, prominence. To overcome the problem of variation between the pool of speakers used during training and those encountered during deployment, in speech recognition spe...
A robust least squares fuzzy regression model based on kernel function
In this paper, a new approach is presented to fit a robust fuzzy regression model based on some fuzzy quantities. In this approach, we first introduce a new distance between two fuzzy numbers using the kernel function, and then, based on the least squares method, the parameters of the fuzzy regression model are estimated. The proposed approach has a suitable performance to...
A tree-based kernel selection approach to efficient Gaussian mixture model-universal background model based speaker identification
We propose a tree-based kernel selection (TBKS) algorithm as a computationally efficient approach to the Gaussian mixture model–universal background model (GMM–UBM) based speaker identification. All Gaussian components in the universal background model are first clustered hierarchically into a tree and the corresponding acoustic space is mapped into structurally partitioned regions. When identi...
Publication date: 2008